SUMMARY


This paper describes the provisional equating of Forms P1 and P2 of the Air Force Officer
Qualifying Test (AFOQT) and the associated analyses in preparation for its operational
implementation in 1987. The pre-implementation equating was necessary (a) to check the adequacy
of the items in the new forms, (b) to assess the similarity of the new forms, and (c) to establish
conversion tables for placing scores from the new test on the metric of Form O. Three forms of
the AFOQT (O, P1, and P2) were administered to about 3,400 military subjects at 11 Air Force
bases. The subjects were from Basic Military Training School (BMTS), Air Force Reserve Officer
Training Corps (AFROTC), and Officer Training School (OTS).

Analyses were computed at the item, subtest, and composite levels, and several types of
equatings were completed. The distributions of items based on item difficulty and item
discrimination were similar across forms, but not identical. Equipercentile equatings were used
to produce conversion tables for Forms P for provisional use prior to the Initial Operational
Test and Evaluation (IOT&E) of the new forms.

PREFACE


The Air Force Human Resources Laboratory (AFHRL) is tasked as the test development
agency for the Air Force Officer Qualifying Test (AFOQT) by Air Force Regulation 35-8,
Air Force Military Personnel Testing System. The current research and development (R&D)
effort was undertaken as part of AFHRL's responsibility to develop, revise, and conduct
research in support of the AFOQT. Work was accomplished under Task 771918, Selection
and Classification Technologies, which is part of a larger effort in Force Acquisition
and Distribution Systems. The study was completed under Work Units 77191847
(Development and Validation of Civilian and Nonrated Officer Selection Methodologies)
and 77191824 (Officer Item Pool Development).

The authors would like to thank their colleagues in the Manpower and Personnel
Division for their assistance in this effort. Mr. Todd Sperl provided adroit assistance
with data tabulation and analysis, and Dr. Malcolm Ree provided expert advice on a
variety of technical issues. A number of colleagues supported this effort by going
on-site to serve as test administrators or proctors. Specifically, we extend our
appreciation to 1Lt Thomas O. Arth, Mr. Roy E. Chollman, Mr. Douglas K. Cowan, Mr.
Refugio Gonzalez, Jr., and A1C Bertrand L. Washer.

The authors acknowledge with considerable gratitude the assistance of Ms. Doris E.
Black, Mr. James L. Friemann, A1C Dave Lawson, Sgt Dave LeBrun, and Ms. Suzanne Farrell
of the Information Sciences Division, AFHRL. Their efforts were instrumental to the
successful accomplishment of the data analysis phase of this study.

Thanks are also expressed to the many operational managers and training staff
members associated with the Air Staff, the Air Force Military Personnel Center (AFMPC),
Air Training Command (ATC), Air Force Reserve Officer Training Corps (AFROTC), Basic
Military Training Center (BMTC), and Officer Training School (OTS). Managers in these
organizations made it possible for the testing to take place, despite the often
considerable inconvenience to ongoing training. Numerous training staff personnel
throughout the Continental United States (CONUS) provided the on-site assistance which
was essential to the successful data collection. When necessary, they even assisted
AFHRL staff with proctoring. We also appreciate the assistance of the thousands of
cadets and basic military trainees who took the various forms of the AFOQT.

Finally, we wish to thank the staff of Psychometrics, Inc., especially Drs. Ray and
Frances Berger, and Dr. Willa Gupta, who did such an excellent job in developing the
best possible AFOQT Forms P under the constraint that the forms be parallel in content
and format to Form O.

AIR FORCE OFFICER QUALIFYING TEST (AFOQT):
FORMS P PRE-IMPLEMENTATION ANALYSES AND EQUATING


I.  INTRODUCTION 

Background of the Air Force Officer Qualifying Test (AFOQT)

The United States Air Force currently selects officers from three applicant pools. One pool
consists of highly qualified high school graduates who are accepted on the basis of Congressional
recommendations and other criteria into the United States Air Force Academy (USAFA) at Colorado
Springs, Colorado. After completing a 4-year college program, graduates enter the Air Force as
second lieutenants. Since the Scholastic Aptitude Test is used as the primary selection tool for
these individuals, they are not required to take the Air Force Officer Qualifying Test (AFOQT)
for selection purposes. The second pool of applicants enters the Air Force through the Air Force
Reserve Officer Training Corps (AFROTC). These individuals attend universities and colleges
throughout the nation and enroll in AFROTC courses in their last 2 years of schooling. The
majority take the AFOQT as high school seniors or before their junior year of college. The third
pool of applicants consists of men and women who have completed a baccalaureate degree at an
accredited university or college and apply for Officer Training School (OTS). These individuals
take the AFOQT for selection into OTS and are commissioned upon completion of the OTS program.

Although the selection of aircrew members dates back to World War I, the first screening test
for preliminary selection of officers, the Aviation Cadet Qualifying Examination, was published
in 1942. Various iterations of the original test, with different names, were used for selection
screening over the next decade, complemented by the Aircrew Classification Battery (ACB). These
instruments underwent considerable change during this era of experimentation. By 1952, a
preliminary version of the AFOQT was developed, and by 1955, the AFOQT replaced the ACB and its
screening test predecessors. Since that time, the AFOQT has been updated periodically. Although
items have changed and subtests have been added or deleted, the composite structure of the AFOQT
has remained rather constant through the years. The experimentation of the 1940s and early 1950s
gave way to evolutionary refinement in more recent decades. Interested readers should consult
Rogers, Roach, and Short (1965) for information about the selection of commissioned officers and
a brief history of testing of Air Force officers.

Recent forms of the AFOQT have been dramatically shortened, and the subtest structure has
been modified. Form N of the AFOQT, implemented in 1978, consisted of 605 items divided into 18
subtests (Gould, 1978). These subtests were used to compute the following five composite
scores: Pilot, Navigator-Technical, Officer Quality, Verbal, and Quantitative. In contrast,
with operational implementation of AFOQT Form O in 1981, substantial changes in content, format,
administration, and scoring were made (see Rogers, Roach, & Wegner, 1986, for details). Form O
consists of 380 items (226 fewer than Form N) and is divided into 16 rather than 18 subtests.
Although the composites are similar (Officer Quality was renamed Academic Aptitude), four
subtests were dropped and two new ones were added. Furthermore, the amount of time required for
administration was reduced from about 7 hours to 4.5 hours. Table 1 shows the number of items in
each subtest and how the subtests are arranged into the five composites for Forms N and O.
Because the number of items and subtest/composite structure of Forms P (discussed in the section
which follows) are so similar to previous forms, the composition of Forms P is also shown in
Table 1.


Table 1. Item, Subtest, and Composite Structure for
AFOQT Forms N, O, and P


Development of AFOQT Form P

Historically, the Air Force Human Resources Laboratory (AFHRL) has been responsible for
periodic updates of the AFOQT, including the new Forms P1 and P2, which became operational in
June 1987. In support of this responsibility, AFHRL contracted with Psychometrics, Inc., of
Sherman Oaks, California, to develop a large pool of items in content areas already covered by
the AFOQT. From the extensive pool of items developed in each of the existing content areas,
AFHRL and Psychometrics, Inc. selected items to be used in conjunction with existing items from
previous forms to create two parallel versions of Form P.

These previous items, which are common across Form O and Forms P, are referred to as anchor
items. They link the three forms for experimental purposes, and were selected on the basis of
their performance with officer applicant samples. Empirical data also provided the basis for
developing and selecting new items for AFOQT Forms P. New items were combined with anchor items
in several sets of experimental booklets and administered primarily to airmen in Basic Military
Training School (BMTS). In some instances, especially for difficult subtests, experimental
booklets were also administered to OTS cadets. Items were evaluated using classical item
analyses. Officer item difficulty estimates were generated to supplement actual difficulty
indices obtained from the airmen samples. New items meeting a variety of psychometric criteria
for difficulty, discrimination, and content were selected for inclusion in the new AFOQT Forms P.
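
The classical item statistics mentioned above can be illustrated with a minimal sketch. It
assumes that difficulty is the proportion of examinees answering an item correctly and that
discrimination is the correlation between the item score and the total score on the remaining
items; the report does not state which indices were actually used, and the function name and
toy data are hypothetical.

```python
from statistics import mean, pstdev


def item_analysis(responses):
    """Classical item statistics for scored (0/1) responses.

    responses: list of examinee rows, each a list of 0/1 item scores.
    Returns one (difficulty, discrimination) pair per item, where
    discrimination is the item-rest correlation (an assumed index) or
    None when it is undefined because a variance is zero.
    """
    n_items = len(responses[0])
    results = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Total score on all other items, so the item is not correlated with itself.
        rest = [sum(row) - row[j] for row in responses]
        p = mean(item)  # proportion correct
        sd_item, sd_rest = pstdev(item), pstdev(rest)
        if sd_item == 0 or sd_rest == 0:
            r = None
        else:
            cov = mean(i * t for i, t in zip(item, rest)) - p * mean(rest)
            r = cov / (sd_item * sd_rest)
        results.append((p, r))
    return results


# Toy data: four examinees, three items of increasing difficulty.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
stats = item_analysis(toy)
```

In this toy example the proportions correct fall from .75 to .25 across the three items, and
every item correlates positively with the rest score.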


Rationale  for  the  Current  Investigation 

Despite the extensive research performed in the development of AFOQT Forms P, the current
investigation was a necessary adjunct. The adequacy of the items which comprise Forms P had
already been assessed, but needed to be checked using data from a more representative sample
composed primarily of officer candidates rather than airmen trainees. In addition, the new Forms
P had to be compared, not only with each other but with Form O, to determine how parallel they
were, and equating analyses had to be conducted to provide conversion tables linking scores on
the new forms with those on the previous form of the AFOQT. Thus, the goals of this
investigation are threefold: (a) to verify the adequacy of items in Forms P using a more
representative sample; (b) to compare Forms O, P1, and P2 to determine if they are parallel,
as designed; and (c) to derive scores on Forms P that are comparable to scores on Form O.


Determining the Adequacy of Items in Forms P Using a More Representative Sample

Test construction procedures used to develop Form P were designed to identify items which
were psychometrically sound. However, as indicated previously, the primary empirical basis for
judgments concerning the adequacy of items was analyses performed on data obtained from airmen
samples. Reliance on data from airmen subjects in the development of items for use with officer
applicants is not ideal. Obviously, the sample on which items were developed is not
representative of the target population to be tested. Differences in age, education, and
aptitude may limit the generalizability of the data obtained.

Considering the drawbacks of using airmen samples, the rationale for their use to screen
candidate items for the AFOQT needs to be explained. A huge item pool was developed which needed
to be administered to a larger number of subjects than was available (without considerable time
and expense) from the pool of officer candidates. For each of the 16 content areas in Forms O
and P, 300 new items were developed. Each of these items was administered to at least 350
subjects. Considering the magnitude of the item development task and the limited supply of
officer candidates, the use of airmen samples, augmented only occasionally with officer
candidates, was a logistical and economic necessity. However, it was also necessary to confirm,
prior to operational implementation, that the items selected for Forms P performed well when
tested on a more representative sample. If they did not, then adjustments could be made in the
test prior to final printing and operational use.

Comparing Forms O, P1, and P2 to Determine If They Are Parallel

A design goal in developing the two new versions of Form P was to construct parallel tests
which were also parallel to Form O. Briefly, the general procedure used was to match
psychometric characteristics (based largely on airmen samples) of items occupying the same
position on Forms O, P1, and P2. A second objective of the current investigation was to
determine the actual degree of parallelism among the three forms based primarily on officer
samples.


Deriving  Scores  on  Forms  P  That  Are  Comparable  to  Scores  on  Form  0 

As discussed above, parallelism in tests is a design goal which can be attained only
imperfectly. Thus, despite the parallel design of the forms, scores on Forms P1 or P2 would
not be exactly equivalent to the same score on previous forms without further equating. However,
as Angoff (1971) has discussed, techniques exist which allow scores derived from different forms,
after conversion, to be directly equivalent. Thus, the third objective of this investigation was
to perform equating analyses which would provide an empirical basis for generating two separate
sets of provisional conversion tables (one set for Form P1, the other set for Form P2) to
link scores on these tests to scores on Form O. These provisional conversion tables would be
modified, if necessary, based on the results of the Initial Operational Test and Evaluation
(IOT&E).
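
The idea of a conversion table can be illustrated with a minimal sketch of unsmoothed
equipercentile equating, the general method named in the Summary. The sketch assumes midpoint
percentile ranks and linear interpolation between adjacent old-form scores; the report does not
give these computational details, so the function names, the interpolation rule, and the toy
data are all hypothetical.

```python
def percentile_ranks(scores, max_score):
    """Percentile rank at each integer score 0..max_score (midpoint convention)."""
    n = len(scores)
    freq = [0] * (max_score + 1)
    for s in scores:
        freq[s] += 1
    ranks, below = [], 0
    for f in freq:
        # Percent scoring below this score plus half of those at this score.
        ranks.append(100.0 * (below + f / 2) / n)
        below += f
    return ranks


def _old_score_at(pr, pr_old):
    """Old-form score whose percentile rank equals pr (linear interpolation)."""
    if pr <= pr_old[0]:
        return 0.0
    if pr >= pr_old[-1]:
        return float(len(pr_old) - 1)
    for y in range(1, len(pr_old)):
        lo, hi = pr_old[y - 1], pr_old[y]
        if pr <= hi:  # here lo < pr <= hi, so hi - lo is positive
            return (y - 1) + (pr - lo) / (hi - lo)


def equipercentile_table(new_scores, old_scores, max_new, max_old):
    """Old-form equivalent for each raw score 0..max_new on the new form."""
    pr_new = percentile_ranks(new_scores, max_new)
    pr_old = percentile_ranks(old_scores, max_old)
    return [_old_score_at(pr, pr_old) for pr in pr_new]


# Toy data: the old form behaves exactly like the new form plus one point.
new_form = [0, 1, 1, 2, 2, 2, 3]
old_form = [s + 1 for s in new_form]
table = equipercentile_table(new_form, old_form, max_new=3, max_old=4)
```

With these toy data each new-form score converts to the old-form score one point higher, which
is the behavior expected when the two score distributions differ only by a constant shift.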


II. METHOD
Subjects 


Rationale  for  Subject  Selection 


Subjects were 3,376 airmen and officer students in BMT, OTS, and AFROTC who were administered
either Form O, Form P1, or Form P2 of the AFOQT. The total number of examinees by training
program and AFOQT form is shown in Table 2. The total sample size was reduced, following data
cleanup, to a computational sample of 3,341 cases. Analyses conducted by test form were based on
the following case counts: N = 1,101 for Form O, N = 1,120 for Form P1, and N = 1,120 for Form
P2. Subjects were all students in these training facilities who were available for testing
from 31 May 1986 to 18 July 1986. The timeframe was limited by the need to prepare conversion
tables in time for the target operational implementation date and to detect and correct problems,
if any, in the Forms P booklets prior to final printing. Start and stop dates were based on
practical considerations. The stop date allowed collection of data from a desired minimum of
3,000 subjects, and also provided sufficient time for data analysis and interpretation, and any
final modification to the Forms P booklets prior to the printing deadline. Camera-ready copies
of the final forms were due on 1 October 1986 at the Air Force Military Personnel Center (AFMPC),
the agency responsible for administrative oversight of the operational testing program.

Table 2. Number of Examinees by Training Program and AFOQT Form*

                             Form
                         O        P1        P2     Total

BMT Airmen             258       256       255       769
                      (23)      (23)      (23)      (23)

OTS Cadet              194       194       195       583
                      (17)      (17)      (17)      (17)

AFROTC Cadet           642       667       666     1,975
                      (58)      (59)      (59)      (59)

Unknown                 17        16        16        49
                       (2)       (1)       (1)       (1)

Total N              1,111     1,133     1,132     3,376
                     (100)     (100)     (100)     (100)

*Cell values shown in parentheses below Ns are percentages of the
total column frequency.
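
As a worked check on the table note, the parenthesized values can be recomputed from the cell
counts. The sketch below is illustrative only: the counts are transcribed from Table 2, while
the dictionary layout and function name are hypothetical, and rounding to whole percentages is
an assumption.

```python
# Counts transcribed from Table 2 (rows: training program, columns: AFOQT form).
COUNTS = {
    "BMT Airmen":   {"O": 258, "P1": 256, "P2": 255},
    "OTS Cadet":    {"O": 194, "P1": 194, "P2": 195},
    "AFROTC Cadet": {"O": 642, "P1": 667, "P2": 666},
    "Unknown":      {"O": 17,  "P1": 16,  "P2": 16},
}


def column_percentages(counts):
    """Return (column totals, whole-number percentage of each cell within its column)."""
    forms = list(next(iter(counts.values())))
    totals = {f: sum(row[f] for row in counts.values()) for f in forms}
    percents = {
        name: {f: round(100 * row[f] / totals[f]) for f in forms}
        for name, row in counts.items()
    }
    return totals, percents


totals, percents = column_percentages(COUNTS)
```

The column totals come to 1,111, 1,133, and 1,132 (3,376 overall), and the rounded percentages
agree with the values printed in the table.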


The rationale for subject selection needs to be elaborated. For the pre-implementation
evaluation described in this paper, a major goal was to obtain data from sufficient subjects at
all points throughout the range of abilities (or performances, as measured by test scores) to
generate preliminary conversion tables. Three groups (i.e., BMTS students, AFROTC cadets, and
OTS cadets) were selected for participation since they were expected to score, on the average, at
different points along the score continuum for the various subtests or composites. Due to age
and educational level, BMT airmen were expected to provide scores primarily at the lower end of
the continuum, whereas AFROTC subjects, who were still in school, were expected to "fill in" the
middle range. OTS subjects, who had completed their baccalaureate degrees, were expected to
score at the higher ranges.


Procedures  for  Subject  and  Site  Selection 


Both BMTS and OTS are located at Lackland Air Force Base, San Antonio, Texas, which is also
the site of the AFHRL testing facility. AFROTC facilities are scattered at many colleges and
universities throughout the country. However, during the data collection period, AFROTC subjects
were temporarily assigned to 11 field training sites. This permitted representative sampling of
AFROTC students at considerable savings of time and travel expenses. The sites involved, and the
numbers and types of subjects tested, are provided in Table 3.

Demographic  Characteristics  of  Subjects 

The following description of demographic characteristics is based on the computational sample
of 3,341 subjects.^1 Most subjects were males (2,689 or 81%); 645 or 19% were females. Most
were white (2,808 or 84%), while 286 (9%) were black and 7% were of other ethnic origin. Ages
ranged from 17 years to 34 years, with the majority (76%) being 22 years of age or younger.
Education ranged from 12 to 21 years, with most subjects having had some college. Only 19%
(n = 514) had 12 years of education, while 80% had between 13 years and 16 years of education.
Educational credentials ranged from high school diplomas to master's degrees. However, 793 or
24% had an associate or baccalaureate degree. Only 1% had been awarded a master's degree.


^1Due to missing demographic data on some cases, the frequencies may not sum to 3,341.

Table 3. Distribution of Examinee Categories by Testing Site^a

                                             Test Form
                                         O        P1        P2

Basic Airmen
  Lackland AFB (LAFB)                  258       256       255
                                      (23)      (23)      (23)

Officer Training School Cadets
  Medina Annex of LAFB                 194       194       195
                                      (17)      (17)      (17)

AFROTC Cadets
  McChord AFB                           59        64        63
                                       (5)       (6)       (6)
  McClellan AFB                         39        44        37
                                       (4)       (4)       (3)
  Tyndall AFB                           28        29        29
                                       (3)       (3)       (3)
  Robins AFB                            38        38        37
                                       (3)       (3)       (3)
  Dover AFB                             37        37        36
                                       (3)       (3)       (3)
  Wright-Patterson AFB                  31        37        37
                                       (3)       (3)       (3)
  Plattsburgh AFB                       60        62        61
                                       (5)       (5)       (5)
  McConnell AFB                         69        71        72
                                       (6)       (6)       (6)
  Vandenberg AFB                       102       100       100
                                       (9)       (9)       (9)
  Bergstrom AFB                         29        32        31
                                       (3)       (3)       (3)
  Lackland AFB                         150       153       163
                                      (14)      (14)      (14)

Unknown                                 17        16        16
                                       (2)       (1)       (1)

Total N                              1,111     1,133     1,132
                                     (100)     (100)     (100)

^aCell values shown in parentheses below Ns are percentages of
the total column frequency.


Administrative Procedures


Testing  at  AFROTC  Field  Training  Sites 

A testing schedule was arranged that would allow two-person teams from AFHRL to make five
trips of 3 to 5 days' duration, usually to multiple sites. These trips were scheduled
sequentially to ensure that sufficient materials would be available for administration. Team
composition varied from trip to trip. The AFOQT forms were administered on days available in the
field-site training schedule, including Saturday and Sunday. Total testing time was
approximately 4 1/2 hours, excluding initial preparation and clean-up. Once testing was
completed at one site, AFHRL teams typically had at least a day to travel to the next site, make
final arrangements, and orient on-site personnel who assisted with proctoring. Wherever
possible, single morning and/or afternoon sessions were conducted with the AFHRL team serving as
test administrator and lead proctor. Large rooms, such as a large testing room, the ballroom of
an officers' club, or a recreation center, were used. Although there were some unavoidable
variations in the quality and configuration of facilities from site to site, care was taken to
ensure that the test administration environment was as standardized as possible and adequate for
administration of the AFOQT. In a few instances, two administration sessions were conducted
simultaneously, with both AFHRL team members serving as test administrators, assisted by on-site
proctors.


Testing at BMT and OTS Sites

Since BMT and OTS training facilities are collocated at Lackland AFB with the AFHRL testing
facility and staff, arrangements for testing at these sites did not involve extensive travel and
required no proctoring assistance from the training staffs of these schools. BMT subjects were
tested in their own facilities, the AFHRL facilities, or some other suitable facility at
Lackland. OTS subjects were tested in the OTS auditorium. Due to the unavailability of 4 1/2-
to 5-hour periods of time in their training schedule, OTS subjects were administered the AFOQT in
two 2 1/2-hour segments. This deviation from the procedures used with the other groups was
unavoidable. In other respects, testing procedures were as similar as possible to those used
with AFROTC subjects, except that AFHRL staff performed all administrator and proctor duties.


Development of a Manual for Administration

To ensure effective and consistent administration of the multiple AFOQT forms during this
investigation, a separate Manual for Administration was prepared. The existing Form O Manual for
Administration was being revised in preparation for development of a Forms P Manual for
Administration. The Form O version was adapted for operational use in the current
investigation. Changes were made so administrators would alert examinees to the need to identify
color-coded test booklets as one of the three forms on the answer sheet, and, in the case of
Forms P, to identify the correct version number. Whenever necessary, changes were also made
(a) to identify across-forms differences in the wording of directions or other attributes of the
tests or their administration, (b) to reflect the experimental nature of the testing, and (c) to
alert administrators to read a separate Privacy Act Statement suitable to the experimental nature
of the testing sessions.


Selection and Training of Administrators and Proctors

To ensure that test administration procedures were as standardized as possible, considerable
emphasis was placed on the selection and training of administrators and proctors. This was
especially important since the test was being administered by multiple administrators/proctors in
multiple settings. Test administrators and proctors were either psychologists with a good
understanding of psychometric principles or experienced test administrators. Nearly all had
prior test administration experience. Those with the most experience, regardless of rank, were
assigned as test administrators in the administrator/proctor teams.

Training materials were developed by AFHRL scientists, and a 1/2-day training session was
held in which AFHRL staff members responsible for administration or proctoring participated.
Topics involved AFOQT test administration, orientation and training of on-site personnel, setting
up of the testing room, distribution of test materials, safeguarding of the tests and answer
sheets, and preliminary data checks.

To help ensure that on-site AFROTC personnel were prepared to assist with proctoring, Air
Force regulations relevant to testing were forwarded in advance to the field sites, along with a
description of the specific duties of test proctors. In addition, AFHRL test administrators were
instructed to review proctoring duties with proctors upon arrival at the site, and AFHRL team
members were requested to supervise on-site proctors during the testing sessions.


Use of Two Different Answer Sheets


Prior to each testing session, answer sheets were placed inside the front cover of each test
booklet. Two types of answer sheets were used: one red, the other green, each differing
slightly in format and in the size and shape of the response ovals or bubbles. The rationale for
use of two different answer sheets requires explication. The red answer sheet has been used
operationally in the administration of the AFOQT Form O to AFROTC cadets. The green answer sheet
has been used in the operational administration of AFOQT Form O to OTS applicants. Two different
answer sheets had been used in obtaining Form O data since scoring had been decentralized (at
Maxwell AFB and either Brooks AFB or Randolph AFB), and the scanning equipment at Maxwell AFB was
unable to process the green sheets. The nature of the answer sheet can affect test scores,
especially on speeded subtests, as pointed out by Wegner and Ree (1965).^2

In collecting Form O data, it was necessary to use both answer sheets to approximate the
previous operational practice. Thus, in preparation for administration at each site, green
answer sheets were inserted in Form O test booklets to be administered to OTS cadets and red
answer sheets in test booklets to be administered to AFROTC cadets. The two types of answer
sheets were equally divided among the BMTS examinees. Forms P booklets were prepared for
administration by inserting only green answer sheets, consistent with the new operational use of
a single type of answer sheet.


Administration of AFOQT Forms O and P


Prior to entry of the examinees into each testing session, test administrators and proctors
placed one booklet (with its answer sheet insert) at each testing station, rotating through the
forms in sequential order (O, P1, P2). This was done to ensure that randomly equivalent groups
were formed. If multiple testing sessions occurred at a site, distribution of the booklets was
counterbalanced by starting with the next booklet in the series on simultaneous or subsequent
testing sessions. Thus, if the last booklet distributed in one session was a Form P1 booklet,
the first booklet distributed at the next testing session was a Form P2 booklet.
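
The rotation described above can be sketched in a few lines. The sketch assumes one booklet per
testing station and a simple carry-over of the rotation position between sessions; the function
name, the seat counts, and the zero-based position index are hypothetical.

```python
FORMS = ("O", "P1", "P2")


def distribute(n_stations, start=0):
    """Assign forms to stations in rotation.

    Returns the form placed at each station and the rotation position at
    which the next session should begin, so that the series continues
    where the previous session stopped.
    """
    seating = [FORMS[(start + i) % len(FORMS)] for i in range(n_stations)]
    next_start = (start + n_stations) % len(FORMS)
    return seating, next_start


# First session with five stations ends on a Form P1 booklet ...
session_1, carry = distribute(5)
# ... so the next session begins with a Form P2 booklet.
session_2, _ = distribute(4, start=carry)
```

Because adjacent stations receive different forms and the series never restarts at Form O, each
form is taken by roughly one third of the examinees at every site.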


Data  Analyses 


Scoring  and  Data  Editing 

To provide accurate scoring of the optical scan answer sheets, test administrators first
checked them to determine if they were suitable for scanning and corrected problems (e.g.,
by removing stray marks or darkening ovals). Green answer sheets were then
scanned by the Technical Services Division (TS)^3 of AFHRL, while red answer sheets were
forwarded to HQ AFROTC for scanning.


^2To avoid such potential for error in future operational use, scoring will be centralized
at AFMPC at Randolph AFB and the green answer sheet will be the only one used with the
operational implementation of Forms P.

^3Division has been redesignated the Information Sciences Division.




Power-Speed Issue

Historically, the AFOQT has contained subtests which have been described as conforming to
speeded, power, or mixed models. A power model is one in which all examinees have enough time to
consider every question in the subtest. A speeded test is one in which the items are easy, but
there is not enough time for each individual to answer every item (Gulliksen, 1950). Therefore,
successive items are reached by fewer and fewer examinees, and individuals responding at a slower
rate have time to consider fewer items than those responding at a faster rate. Mixed models are
those that have items written as in a power test, yet examinees do not have enough time to
answer every item. Mixed models, then, do not follow either a pure power or pure speeded model.

The index used to evaluate whether a given subtest conforms to a power, speed, or mixed model
is the proportion of subjects not responding to items in the subtest (i.e., proportion of
omitting). Power tests characteristically have items with proportions of omitting that are less
than .05. When the item number is plotted against the corresponding proportion of omitting, power
tests exhibit a flat line. True speeded subtests have low omitting rates for items at the
beginning of the subtest, but then show a steady increase in omitting rates for items in the last
half of the subtest. Mixed model subtests are defined as having low, flat rates for the majority
of the items, but an increase in omitting rate for the final few items.
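
As a rough illustration of the index described above, the following Python sketch computes the
proportion of omitting for each item and applies the three definitions. The function names, the
use of None to mark an omitted item, and the `final_few` cutoff are assumptions made for this
example; only the .05 threshold comes from the text.

```python
def omit_proportions(responses):
    """Proportion of examinees giving no response to each item.

    `responses` holds one list per examinee; None marks an omitted item.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [
        sum(1 for person in responses if person[i] is None) / n_examinees
        for i in range(n_items)
    ]


def classify_subtest(omit_rates, low=0.05, final_few=5):
    """Label a subtest 'power', 'speeded', or 'mixed' from its omit rates.

    `final_few` (how many closing items count as "the final few") is an
    assumed cutoff, not a value given in the report.
    """
    n_items = len(omit_rates)
    # Power: every item has an omit rate below the threshold (a flat line).
    if all(rate < low for rate in omit_rates):
        return "power"
    # Position of the first item whose omit rate reaches the threshold.
    first_high = next(i for i, rate in enumerate(omit_rates) if rate >= low)
    # Mixed: elevated omitting is confined to the final few items.
    if first_high >= n_items - final_few:
        return "mixed"
    # Otherwise omitting starts rising well before the end: speeded.
    return "speeded"
```

A flat profile such as twenty items at .01 is labeled "power"; the same profile with only the
last three items rising is labeled "mixed"; a profile that starts rising at the midpoint is
labeled "speeded".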

Whether a subtest is best described as power or speeded has two implications. One
implication concerns the interpretation of test results. Not having an appropriate understanding
of the nature of the subtest could lead to misinterpretation of an examinee's knowledge and
abilities. A second implication is that knowledge of the degree of speededness should guide
decisions about data analysis. For example, the computational formulas for item difficulty and
item discrimination indices differ for power and speeded tests. If an inappropriate analysis is
used, the item statistics computed may underestimate or overestimate the "true" statistics.

Skinner and Ree (1987) categorized the 16 AFOQT subtests according to the model to which each
conformed. Mechanical Comprehension, Rotated Blocks, and General Science were judged to be power
subtests; Electrical Maze, Instrument Comprehension, and Block Counting were classified as
speeded; and the remaining subtests were described as following a mixed model. This pattern is
similar but not identical to the classifications made by Gould (1978) and Miller (1974). These
two studies designated Electrical Maze, Instrument Comprehension, Scale Reading, Table Reading,
and Block Counting as being speeded. Nonetheless, it should be noted that, in the Skinner and
Ree data, these five subtests have highly speeded components. Later in this paper, the degree of
speededness for the subtests in Forms O, P1, and P2 will be discussed.


Classical Item Analysis

The analysis of performance of Forms P1 and P2 at the item level was based on classical
or "true score" theory (Gulliksen, 1950; Henrysson, 1971). Item difficulties (p) were calculated
as the proportion of examinees responding correctly to the item. The biserial correlation
(r_bis) between the item score (correct or incorrect) and total subtest score was used as the
index of the discrimination value of each item. In the analysis of power subtests, the level of
difficulty is calculated by dividing the number of examinees selecting the correct option by the
number of examinees taking the subtest. Since for power subtests it is assumed that all items
are reached by all examinees, the number of people attempting each item equals the number of
examinees taking the subtest. In contrast, in the analysis of speeded subtests, the total number
of examinees taking the test is not used in calculating the level of difficulty. Rather,
difficulty is defined as the number of examinees selecting the correct option divided by the
number of examinees who make a response to the item or one later in the subtest. Examinees who
do not finish a subtest are not included in the analyses of the items they do not reach.
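
The two difficulty definitions and the biserial index can be sketched in Python as follows. This
is an illustration only, not the authors' scoring program: the data layout (1 = correct,
0 = incorrect, None = no response), the function names, and the textbook biserial formula used
here are assumptions.

```python
from statistics import NormalDist, mean, pstdev


def difficulty_power(responses, item):
    """p = number correct / number of examinees taking the subtest."""
    correct = sum(1 for person in responses if person[item] == 1)
    return correct / len(responses)


def difficulty_speeded(responses, item):
    """p = number correct / number who answered this item or a later one."""
    reached = [
        person for person in responses
        if any(r is not None for r in person[item:])
    ]
    if not reached:
        return None  # nobody reached the item, so p is undefined
    correct = sum(1 for person in reached if person[item] == 1)
    return correct / len(reached)


def biserial(item_scores, total_scores):
    """Textbook biserial: (M1 - M0) / SD_total * (p * q / y).

    y is the standard normal ordinate at the point that splits the
    distribution into proportions p and q. Requires 0 < p < 1.
    """
    p = mean(item_scores)
    q = 1 - p
    m1 = mean(t for s, t in zip(item_scores, total_scores) if s == 1)
    m0 = mean(t for s, t in zip(item_scores, total_scores) if s == 0)
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (m1 - m0) / pstdev(total_scores) * (p * q / y)
```

For an item that only half of the examinees reach, the speeded definition divides by the smaller
group, so its p value is higher than the power definition would give for the same responses.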


Subtest and Composite Analysis

Subtest and composite raw scores were described and compared in terms of their mean, standard
deviation, skewness, kurtosis, and reliability. Proportion correct was also obtained. Test
reliability was computed using Kuder-Richardson Formula 20 for power subtests. Intercorrelations
were also calculated for subtest and composite raw scores.
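
For readers unfamiliar with Kuder-Richardson Formula 20, a minimal Python sketch follows. It uses
the standard definition, KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores); the
function name and the 0/1 score-matrix layout are assumptions made for the example.

```python
from statistics import pvariance


def kr20(score_matrix):
    """KR-20 for a matrix of 0/1 item scores (one row per examinee)."""
    n_examinees = len(score_matrix)
    k = len(score_matrix[0])  # number of items
    totals = [sum(person) for person in score_matrix]
    # Sum of item variances p * q over all items.
    sum_pq = 0.0
    for i in range(k):
        p = sum(person[i] for person in score_matrix) / n_examinees
        sum_pq += p * (1 - p)
    # Population variance of total scores, matching the p * q item variances.
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))
```

With the four-examinee matrix [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]], the item variances
sum to .625, the total-score variance is 1.25, and the function returns .75.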


Equating  Design 

An equivalent random groups design (Angoff, 1971) was used to equate Forms P1 and P2 to
Form O. Forms P1 and P2 were constructed to be parallel to each other and to Form O in content,
difficulty, and reliability. Because the new forms were content parallel, an equating, as
opposed to a calibration, was conducted.

The approach taken to equate the new forms gave consideration to the use of different answer
sheets in the operational AFOQT Form O testing program. An inspection of testing load
frequencies indicated that the number of officer applicants examined each year on the two answer
sheets was roughly equal. That is, about half of the examinees were OTS applicants tested on the
green answer sheet and the other half were AFROTC applicants tested on the red answer sheet. The
proportion of examinees by answer sheet type observed in the operational program needed to be
preserved in the equating analyses to account for potential effects that the different answer
sheet features may have had on Form O performance. The answer sheets differed not only in color
but also in structural features such as response grid and arrangement. A weighting procedure was
devised to obtain a Form O distribution for equating analyses which gave equal weight to the
scores of subjects in the current sample tested on either answer sheet. As described in a
previous section of this paper, fewer of the Form O subjects had been supplied a green answer
sheet than a red answer sheet. Therefore, scores for the remaining subjects were weighted by
2.661 (the ratio of the number of examinees tested on red sheets to those tested on green sheets)
to yield a Form O distribution in which the scores obtained from the two answer sheet types were
represented in equal numbers.
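
The weighting idea can be sketched in Python as below. The sketch assumes that the weight (the
red-to-green count ratio, 2.661 in the study) is applied to each green-sheet score so that both
sheet types contribute equal total weight; the function name and the toy data are hypothetical.

```python
from collections import Counter


def weighted_distribution(red_scores, green_scores):
    """Weighted frequency distribution giving both sheet types equal weight."""
    # Ratio of red-sheet to green-sheet examinees.
    weight = len(red_scores) / len(green_scores)
    freq = Counter()
    for score in red_scores:
        freq[score] += 1.0        # each red-sheet score counts once
    for score in green_scores:
        freq[score] += weight     # each green-sheet score counts `weight` times
    return weight, dict(freq)
```

With four red-sheet scores and two green-sheet scores the weight is 2.0, and each sheet type
contributes a total weight of 4.0 to the resulting distribution.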

Linear and equipercentile equatings were accomplished as described by Angoff (1971, pp.
568-573) for each composite of each form of the new test. In the linear method, raw scores on
the two forms that have the same z-score value are set equivalent; in the equipercentile
method, raw scores that have the same percentile rank are set equivalent. Since the
equipercentile method may produce irregular equating curves, three forms of smoothing were
conducted. Linear, quadratic, and cubic polynomial regressions were used for smoothing, with the
Form P scores entered as the independent variables and the Form O scores serving as the dependent
variables. For linear smoothing, the first power of the Form P score was entered as the
independent variable into a multiple regression equation; for quadratic, the first and second
powers were entered; and for cubic, the first, second, and third powers were entered. As a
result, four equatings (i.e., one linear equating and three smoothings of equipercentile
equating) were produced for each composite on Form P1 and four for each composite on P2.
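
The two equating methods can be sketched in Python as follows. This is a simplified illustration
under stated assumptions, not the procedure actually programmed for the study: percentile ranks
use the midpoint convention, the equipercentile conversion interpolates linearly between observed
reference-form score points, and the polynomial smoothing step is not shown. All function names
are hypothetical.

```python
from statistics import mean, pstdev


def linear_equate(x, new_scores, ref_scores):
    """Reference-form score with the same z-score as x on the new form."""
    z = (x - mean(new_scores)) / pstdev(new_scores)
    return mean(ref_scores) + z * pstdev(ref_scores)


def percentile_rank(x, scores):
    """Percent of scores below x plus half the percent exactly at x."""
    below = sum(1 for s in scores if s < x)
    at = sum(1 for s in scores if s == x)
    return 100.0 * (below + 0.5 * at) / len(scores)


def equipercentile_equate(x, new_scores, ref_scores):
    """Reference-form score with the same percentile rank as x on the new form."""
    target = percentile_rank(x, new_scores)
    points = sorted(set(ref_scores))
    ranks = [percentile_rank(p, ref_scores) for p in points]
    # Clamp to the observed range of the reference form.
    if target <= ranks[0]:
        return float(points[0])
    if target >= ranks[-1]:
        return float(points[-1])
    # Interpolate between the two score points that bracket the target rank.
    for i in range(len(points) - 1):
        r0, r1 = ranks[i], ranks[i + 1]
        if r0 <= target <= r1:
            frac = (target - r0) / (r1 - r0)
            return points[i] + frac * (points[i + 1] - points[i])
```

When the reference distribution is simply the new-form distribution shifted by a constant (for
example, scores 1 to 5 versus 11 to 15), both functions convert a new-form score of 3 to 13.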

Decisions had to be made about which method was appropriate for each composite. The
decisions were based on the similarity of the distribution of the equated new test scores (e.g.,
Form P1) with the distribution of the reference test scores (i.e., Form O) (Braun & Holland,
1982). The method of equating which produced the greatest similarity in the two distributions
was selected. Several statistical indices of goodness-of-fit were examined to distinguish among
the equatings. These were (a) the standard error of estimate for polynomial smoothing techniques
and (b) three measures of deviation between raw scores (bias, absolute average deviation, and
root mean square deviation) for equipercentile versus linear (z-score) equatings.
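
The three deviation measures in (b) can be sketched in Python as below. The report does not spell
out the formulas, so the sketch assumes the usual definitions applied to the paired equated
scores at each raw score point: bias is the mean signed difference, absolute average deviation is
the mean absolute difference, and root mean square deviation is the square root of the mean
squared difference. The function name is hypothetical.

```python
from math import sqrt


def deviation_indices(equated_a, equated_b):
    """Return (bias, absolute average deviation, RMSD) for paired equated scores."""
    diffs = [a - b for a, b in zip(equated_a, equated_b)]
    n = len(diffs)
    bias = sum(diffs) / n                       # signed differences may cancel
    aad = sum(abs(d) for d in diffs) / n        # magnitude regardless of sign
    rmsd = sqrt(sum(d * d for d in diffs) / n)  # penalizes large differences
    return bias, aad, rmsd
```

For equated scores of 10, 12, 15 under one method and 11, 12, 13 under the other, the differences
are -1, 0, and 2, giving a bias of about .33, an absolute average deviation of 1.0, and an RMSD
of about 1.29.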


III. RESULTS AND DISCUSSION


Twelve items were previously removed from Form O due to double keys, miskeys, or poor item
performance. Three items were removed from Verbal Analogies, four items from Arithmetic
Reasoning, two from Data Interpretation, one from Word Knowledge, one from Mechanical
Comprehension, and one from Scale Reading. Because these items were removed from the current
analyses, the number of items per subtest for Form O differs from those for the corresponding
subtests in Forms P1 and P2. This influences the comparison of Form O with Forms P1 and
P2, but not the comparison of P1 with P2.

The item omitting rates for Forms O, P1, and P2 in these samples were used to determine
the type of analysis (power, speeded, or mixed) for each subtest. While it is appropriate to
make these determinations for analysis purposes, no permanent reclassification of the subtests as
speeded or non-speeded is implied. The patterns of omitting in the samples were consistent
across the three forms for each subtest. Mechanical Comprehension, Rotated Blocks, and General
Science conformed to the power model, whereas Electrical Maze, Scale Reading, Instrument
Comprehension, Block Counting, and Table Reading exhibited highly speeded components. The
remaining eight subtests showed slightly speeded or mixed-model components. Therefore, it was
decided to analyze the five highly speeded subtests using the speeded computational formulae for
item difficulty and discrimination, while analyzing the remaining subtests, even if slightly
speeded, as power subtests.

In discussing the results of this investigation, general comments will be provided first.
Then each of the three forms will be discussed individually in the following order: Form O, Form
P1, and Form P2. Next, comparisons will be made between Form O and Forms P1 and P2.
Finally, comparisons will be made between Forms P1 and P2.


Item Analysis

Item Difficulty. Item difficulty is based on the proportion of individuals selecting the
correct option for a given question and is dependent on the ability in the sample (sample
specificity). Difficulties have a range of .00 to 1.00. Items with values between .00 and .30
have a low proportion of people selecting the correct option and therefore are considered to be
hard. Items with values between .70 and 1.00 have a high proportion of people selecting the
correct option and therefore are considered easy. Interpretation can be confusing in that an
item with a high item difficulty index (e.g., .90) is an easy item. The converse is that an item
with a low difficulty index (e.g., .20) is a difficult item.

The reader should note that the following discussion must be interpreted with care due to the
way in which the item difficulty values were distributed. The categories reported in Table 4 are
arbitrary and other categories could have been generated (e.g., .31 to .50), resulting in
slightly different summarizations. Furthermore, scores within a category may be farther apart
(e.g., .22 and .39) than scores across two category boundaries (e.g., .39 and .42). These
category boundaries were selected because of historical context, and are therefore meaningful in
context. The reader should also note that the item difficulty and item discrimination indices
are sample-specific. Since the samples in this study are not samples of applicants, but rather,
officer cadets and enlisted personnel, the results may not be identical to those for operational
samples.
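
The point about category boundaries can be illustrated with a short Python sketch that tallies
index values into bands. The bands below (.21 to .40, .41 to .60, .61 to .80, .81 and above) are
inferred from the ranges cited in the discussion of Table 4; the lowest band, the upper limit of
1.00, and the function name are assumptions made for the example.

```python
from collections import Counter

# Category bands inferred from the ranges cited in the text (assumed layout).
BANDS = [(0.00, 0.20), (0.21, 0.40), (0.41, 0.60), (0.61, 0.80), (0.81, 1.00)]


def tally_by_band(values):
    """Count index values falling in each band, after rounding to two decimals."""
    counts = Counter()
    for value in values:
        v = round(value, 2)
        for low, high in BANDS:
            if low <= v <= high:
                counts[f"{low:.2f}-{high:.2f}"] += 1
                break
    return dict(counts)
```

Using the values cited above, .22 and .39 land in the same band although they differ by .17,
while .39 and .42 land in different bands although they differ by only .03.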

As can be seen in Table 4, most items in Form O in this sample range in difficulty from .21
to .80. Four subtests have no item difficulties below .41 (Reading Comprehension, Math
Knowledge, Block Counting, and Hidden Figures), while one subtest (Table Reading) has the
majority of its item difficulties in the .81 to .99 category (25 of 40). Only two subtests have
items below .21, with each having just one item in that range. This indicates that the AFOQT has
very few extremely difficult items and only one subtest (Table Reading) contains a majority of
extremely easy items. As shown in Table 5, the median level of item difficulty for nine subtests
falls in the .41 to .60 category, whereas six subtests have a median difficulty in the .61 to .80
category. Thus, most subtests are of average difficulty, with about one-third of the subtests
being above average on the difficulty index (i.e., easier subtests).

Table 4. Distribution of Item Difficulty

**Analyzed as a speeded test.

The items in Form P1 also fall mainly in the difficulty range of .21 to .80. A notable
exception is Table Reading, with the majority of item difficulty values between .81 and .99 (27
of 40). This reveals that Table Reading is relatively easy. No subtest has items with
difficulties below .21. Eight subtests in Form P1 have medians in the .41 to .60 category and
seven subtest difficulty medians fall between .61 and .80. This shows that about half of the
subtests are of average difficulty and about half of the subtests are above average on the
difficulty index (i.e., easier subtests).

The items in P2 also fall mainly in the range from .21 to .80. As with Form P1, most of
the items in Table Reading fall in the .81 to .99 category (29 of 40). As before, Table Reading
appears to be one of the easier subtests. Only five subtests have medians in the .41 to .60
range, while nine fall in the .61 to .80 range. This indicates that about one-third of the
subtests in P2 are of average difficulty and about one-half are above average on the difficulty
index.

The first set of comparisons among the three test forms focuses on changes in the
distributions of item difficulty from Form O to Forms P1 and P2. Collectively, the
distributional statistics in Tables 4 and 5 indicate that, relative to the items in Form O, items
in Forms P1 and P2 have shifted toward the easier end of the difficulty continuum. Four
subtests in both Forms P1 and P2 consistently have higher mean item difficulty values
(by more than .02), and usually higher median, minimum, and maximum values, than the same
subtests in Form O. These subtests are Arithmetic Reasoning, Data Interpretation, Scale
Reading, and General Science. An additional four subtests are easier in only one of the new
forms: Block Counting in Form P1 and Verbal Analogies, Word Knowledge, and Instrument
Comprehension in Form P2. Several exceptions to the trend toward easier items in Forms P are
noteworthy. The difficulty of the items in the Aviation Information, Math Knowledge, and Table
Reading subtests is comparable across forms. Further, two Form O subtests contain items which
are easier on the average than those in either Form P1 or P2 (Electrical Maze and Hidden
Figures). Three additional subtests in Form O are easier than those in Form P1 only (Reading
Comprehension, Mechanical Comprehension, and Rotated Blocks).

The second set of comparisons addresses the comparability of difficulty of the items in the
new test forms. As shown in Tables 4 and 5, the difficulty level of items in about one-third of
the subtests is highly similar in Forms P1 and P2 (Arithmetic Reasoning, Electrical Maze,
Scale Reading, Table Reading, General Science, and Hidden Figures). In eight of the 10 remaining
subtests, Form P2 clearly contains more items of lower difficulty (i.e., easier items). The trend
toward easier items in Form P2 is most pronounced in the Reading Comprehension, Data
Interpretation, Mechanical Comprehension, Instrument Comprehension, and Aviation Information
subtests. The same pattern is evident, but the difference in average item difficulty between
Forms P1 and P2 is smaller, in the Verbal Analogies, Word Knowledge, and Rotated Blocks subtests.
Results indicate that only two subtests are easier in Form P1 than in Form P2 (Math Knowledge and
Block Counting). Test equating procedures were applied later to ensure that equivalent scale
scores were derived for Forms P1 and P2, despite the observed differences in item difficulty.

Item Discrimination. Item discrimination was operationally defined as the biserial
correlation between the score on an individual item (0 = incorrect, 1 = correct) and the subtest
total score. Items with discrimination values below .21 are typically viewed as having poor
discriminative power, while items above .81 are viewed as having excellent discriminative power.
Item discrimination data are presented in Tables 6 and 7.

Table 5. Summary Statistics of Item Difficulty

The majority of the item discrimination values for Form O fall in the .41 to .80 range,
suggesting that most items have average to above average discriminative power. Verbal Analogies,
Reading Comprehension, Math Knowledge, and Table Reading each have several items with
discrimination values between .81 and .99. While no subtest has items with values below .21,
seven subtests have at least one item in the .21 to .40 range (Scale Reading has 8 of 39 items in
the latter category). The median discrimination value for twelve subtests falls in the .61 to
.80 category, indicating the items as a whole have good discrimination abilities. Four subtest
median discrimination values (Data Interpretation, Mechanical Comprehension, Scale Reading, and
General Science) are between .41 and .60, indicating moderate discrimination abilities.

The discrimination pattern for Form P1 is somewhat similar to that for Form O, with the
majority of values falling in the .41 to .80 range. While six subtests have at least one highly
discriminating item (i.e., in the .81 to .99 range), nine have items of below average
discriminative power (i.e., items in the .21 to .40 category). Scale Reading has 10 of 40 items
in the latter category. Five subtests (Verbal Analogies, Data Interpretation, Mechanical
Comprehension, Electrical Maze, and Scale Reading) have median values in the .41 to .60 range
while eleven subtests have medians in the .61 to .80 range. On the whole, Form P1 items have a
good ability to make discriminations among individuals.

Most Form P2 items have discrimination values in the .61 to .80 category. Eight subtests
have at least one item in the .81 to .99 range, with Math Knowledge and Instrument Comprehension
having approximately 50% of the items in that range. Eight subtests have items in the below
average category, with Scale Reading having 8 of its 40 items there. Only three subtests have
median discrimination values in the .41 to .60 category (Mechanical Comprehension, Electrical
Maze, and Scale Reading), whereas thirteen subtests have median discrimination values spread
between .61 and .80. In short, Form P2 also shows good discrimination ability.

In general, the distributions of item discrimination values for Forms P1 and P2 are
either similar to Form O distributions or tend to shift to higher levels of discrimination. In
seven subtests, the mean and median discrimination values for items in both Forms P1 and P2
exceed those for items in Form O by at least .03 (Arithmetic Reasoning, Data Interpretation, Word
Knowledge, Math Knowledge, Mechanical Comprehension, Scale Reading, and General Science). The
same result is seen for two additional subtests in Form P2 only (Instrument Comprehension and
Aviation Information). Items in the same subtests on the other new test, Form P1, are
comparable in discriminative power to items in Form O. Of the remaining seven subtests,
only Electrical Maze items are clearly superior in discriminability on Form O. The rest of the
subtests are either similar among the three forms (Hidden Figures) or provide somewhat better
discrimination in Form O than in one (but not both) of the new forms (Verbal Analogies, Reading
Comprehension, Block Counting, Table Reading, and Rotated Blocks).

A comparison between Forms P1 and P2 shows that most subtests are composed of items with
highly similar discriminative power. The consistency is observed in the distribution of item
discrimination values in Table 6 and in the summary statistics in Table 7. In ten subtests the
mean discrimination values for the two new forms differ by .02 or less. In the other six
subtests, one test form is clearly superior to the other. Both the mean and median item
discrimination values are higher in the Block Counting and Table Reading subtests on Form P1
and in the Reading Comprehension, Data Interpretation, Instrument Comprehension, and Rotated
Blocks subtests on Form P2. Although Forms P1 and P2 are not precisely equivalent in item
discrimination, the purpose of the follow-on test equating analyses is to ensure that converted
test scores will be directly equivalent.

Table 7. Summary Statistics of Item Discrimination



Subtest Analysis

The format for the discussion of the subtest analyses will be similar to the preceding format
in that each form of the AFOQT will be discussed independently of the other forms. For each
subtest, the discussion will focus on the proportion correct^ and a measure of internal
consistency. Skew, kurtosis, and the intercorrelations of the subtests will be discussed for all
three forms together. These data are presented in Table 8 and Table 9.

As discussed earlier in this paper, items were omitted from scoring in six Form O subtests
due to miskeys or poor item performance. To facilitate across-forms comparisons of the affected
subtests, two mean scores are reported for Form O (see Table 8). The first is the actual mean
number of items answered correctly and is based on the number of items scored. The second set of
mean scores (shown in parentheses) is adjusted for subtest length. Ratios were solved to
determine the Form O mean value for a subtest length equivalent to that of Forms P1 and P2.
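
The ratio adjustment can be sketched in Python as below, assuming a simple proportion: the
adjusted mean stands to the full subtest length as the actual mean stands to the number of items
scored. The function name and the numbers in the usage note are hypothetical, not values from
Table 8.

```python
def length_adjusted_mean(actual_mean, items_scored, target_length):
    """Scale a mean raw score to the length of a longer (or shorter) subtest.

    Solves actual_mean / items_scored = adjusted_mean / target_length.
    """
    return actual_mean * target_length / items_scored
```

For example, a hypothetical mean of 11.0 on 22 scored items corresponds to an adjusted mean of
12.5 on a 25-item subtest.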

The reader should also note that some of the reliability indices reported in Table 8 may be
inflated because of the speeded nature of the subtests. The reliability values for Electrical
Maze, Scale Reading, Instrument Comprehension, Block Counting, and Table Reading are not
reported, because no parallel form indices are yet available. A more appropriate measure of
reliability would involve the use of correlation between separately timed parallel forms. These
data are not available at this time.

Form O has five subtests for which the average proportion of items answered correctly is
greater than .60 and three subtests with average proportions less than .50. The remaining
subtests fall between .50 and .60. This shows that most subtests are average to below average in
difficulty. Hidden Figures and Table Reading are the two easiest subtests while Electrical Maze,
General Science, and Aviation Information are the most difficult. The measures of internal
consistency (reliability) are fairly high for Form O. One subtest is below .71, four subtests
fall in the range of .71 to .80, and six fall in the range of .81 to .90. Five subtests are
judged to have speeded properties (Electrical Maze, Instrument Comprehension, Scale Reading,
Table Reading, and Block Counting); therefore, internal consistency is not an appropriate measure
of reliability for these subtests.

Form P1 has eight subtests with average proportion correct values greater than .60 and
three subtests with values less than .50. The remaining five subtests have proportions between
.50 and .60. This shows that most subtests are average to below average in difficulty (i.e.,
easy subtests). In Form P1, Table Reading and Math Knowledge are the easiest subtests while
Electrical Maze, Aviation Information, and Mechanical Comprehension are the most difficult. The
measures of internal consistency are fairly high for Form P1. The values of internal
consistency measures fall mainly between .81 and .90 (seven subtests). Three subtests have
reliability values between .71 and .80, and one subtest has a value greater than .91. As with
Form O, five subtests are judged to be speeded; therefore, other measures of reliability need to
be generated.

Form P2 also has eight subtests with average proportion correct values greater than .60,
but only one subtest with a value less than .50. The remaining seven subtests have proportions
between .51 and .60. This shows that most subtests are average to below average in difficulty


^The proportion correct values reported in Table 8 are the same as the mean item difficulty
values shown in Table 5 for power subtests but not for speeded subtests. Proportion correct
values were computed by dividing the mean number of items answered correctly by the number of
items scored.

Table 8. Descriptive Statistics of Subtests

Table 9. Intercorrelations Among Subtests


Subtest Form   AR   RC   DI   WK   MK   MC   EM   SR   IC   BC   TR   AI   RB   GS   HF
VA      O     .62  .75  .58  .72  .67  .56  .38  .55  .49  .51  .44  .45  .48  .56  .50
        P1    .67  .71  .65  .73  .64  .55  .38  .56  .49  .48  .44  .45  .47  .61  .47
        P2    .69  .76  .70  .76  .67  .55  .40  .53  .52  .50  .51  .45  .51  .64  .50
AR      O          .63  .68  .51  .77  .55  .47  .71  .50  .56  .49  .37  .55  .54  .49
        P1         .64  .77  .59  .77  .56  .42  .72  .46  .56  .54  .39  .50  .62  .48
        P2         .65  .78  .58  .80  .61  .48  .68  .56  .56  .57  .42  .60  .65  .52
RC      O               .61  .80  .67  .54  .39  .52  .45  .47  .44  .46  .45  .62  .45
        P1              .65  .75  .55  .46  .33  .54  .42  .45  .43  .42  .37  .58  .38
        P2              .69  .78  .60  .52  .33  .49  .45  .45  .46  .43  .43  .62  .44
DI      O                    .53  .65  .49  .44  .63  .48  .54  .49  .38  .45  .48  .46
        P1                   .57  .66  .51  .42  .70  .48  .55  .56  .42  .48  .56  .49
        P2                   .59  .71  .58  .43  .66  .52  .54  .58  .41  .57  .60  .51
WK      O                         .57  .49  .31  .41  .42  .39  .36  .44   --  .58  .38
        P1                        .55  .46  .30  .46  .38  .40  .34  .43  .36  .63  .35
        P2                        .57  .52  .29  .42  .46  .41  .40  .44  .41  .64  .42
MK      O                              .55  .48  .68  .48  .58  .54  .38  .57  .59  .53
        P1                             .54  .41  .65  .42  .52  .57  .35  .51  .66  .49
        P2                             .56  .46  .63  .55  .52  .56  .40  .56  .66  .55
MC      O                                   .50  .50  .56  .51  .35  .54   --   --  .44
        P1                                  .49  .50  .56  .48  .32  .57  .58   --  .40
        P2                                  .52  .48  .62  .46   --  .59   --  .72  .45
EM      O                                        .50  .51  .56  .41  .38  .52  .46  .44
        P1                                       .50  .49  .50  .38  .36  .44  .45  .42
        P2                                       .52  .51  .52  .43   --   --   --   --
SR      O                                              --   --   --   --   --   --   --
        P1                                             --   --   --   --   --   --   --
        P2                                             --   --   --   --   --   --   --
IC      O                                                   --   --   --   --   --   --
        P1                                                  --   --   --   --   --   --
        P2                                                  --   --   --   --   --   --
BC      O                                                        --   --   --   --   --
        P1                                                       --   --   --   --  .46
        P2                                                       --   --   --   --   --
TR      O                                                             --   --   --   --
        P1                                                           .27  .39   --  .43
        P2                                                           .37   --  .39  .48
AI      O                                                                 .41  .54  .36
        P1                                                                .40   --  .28
        P2                                                                .43   --   --
RB      O                                                                      --  .52
        P1                                                                     --   --
        P2                                                                     --   --
GS      O                                                                          .41
        P1                                                                         .41
        P2                                                                          --

-- = value missing from the source copy.
MC, Form P2: the column positions of the values adjacent to the two gaps are inferred from the
row length.
TR, Form O: only two of the four values survive (.32 and .45); their columns cannot be
determined.
(i.e., easy subtests). For Form P2, Table Reading and Verbal Analogies are the easiest
subtests, while Electrical Maze, Aviation Information, and Mechanical Comprehension are the most
difficult. As for the measures of internal consistency, Form P2 has four subtests with values
in the .71 to .80 range, six subtests with values in the .81 to .90 range, and one subtest with a
value greater than .91. Again, those tests judged to be speeded do not have meaningful values at
this time.

For the comparisons between forms, the subtests were judged to be similar in difficulty if
the actual (or adjusted) mean scores differed by less than one raw score unit^. Three subtests
in Form O were more difficult than the corresponding subtests in Form P1 (Arithmetic Reasoning,
Data Interpretation, and Scale Reading), while two subtests were easier in Form O than in Form
P1 (Reading Comprehension and Electrical Maze). The remaining 11 subtests were judged to be
similar in mean scores. Form O had two subtests that were more difficult than the corresponding
subtests in Form P2 (Arithmetic Reasoning and Data Interpretation) and only one subtest that
was easier (Electrical Maze). As for the comparisons of internal consistency measures, Forms
P1 and P2 were very similar to Form O. For all three forms, most reliability values were in
the .81 to .90 range, with Math Knowledge having the highest internal consistency values. Forms
P1 and P2 had slightly higher values than Form O. This indicates that the new forms may be
slightly more internally consistent than the previous form.
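The one-raw-score-unit rule used for the difficulty comparisons can be sketched as below. The function names are hypothetical, and the proportional rescaling in `adjusted_mean` is an assumption about how means were adjusted for differing numbers of scored items; the report's exact adjustment is not shown in this section.

```python
def adjusted_mean(mean_score: float, items_scored: int, target_items: int) -> float:
    """Rescale a mean to a common test length (proportional rescaling is an assumption)."""
    return mean_score * target_items / items_scored


def compare_difficulty(mean_a: float, mean_b: float, threshold: float = 1.0) -> str:
    """Classify form A against form B; a lower mean score indicates a harder subtest."""
    diff = mean_a - mean_b
    if abs(diff) < threshold:
        return "similar"
    return "a_harder" if diff < 0 else "a_easier"
```

For example, hypothetical means of 12.4 and 13.1 would be classed as similar, while 10.0 against 11.5 would mark the first form as the more difficult one.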

Form P2 is slightly easier than Form P1. Two subtests in P2 were easier than their
counterparts in P1 (Reading Comprehension and Instrument Comprehension); none of the subtests
in P2 was more difficult than its counterpart in P1. The remaining 14 subtests had mean
scores that differed by less than one raw score point. The test equating analyses to be
described later in this paper have the effect of removing observed differences in subtest
difficulty from test scores. Thus, the test form administered becomes a matter of indifference
to examinees. As for internal consistency, inspection of Table 8 reveals that the two forms are
almost identical.

The discussion of skew and kurtosis for the subtests was delayed until this point because
the pattern of results is nearly identical for the three forms. It should be pointed out here
that the normal distribution has a value of 0.0 for both skew and kurtosis. For all three forms,
no subtest exhibited skew values less than -1.00 or greater than 1.00. This indicates that the
subtests are relatively symmetrical. As for kurtosis, twelve subtests tend toward normality.
Four subtests have kurtosis values around -1.00 or less. These subtests (Arithmetic Reasoning,
Word Knowledge, Math Knowledge, and Instrument Comprehension) have slightly flatter distributions
than do the remaining subtests.
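Since the text treats 0.0 as the normal-distribution value for both statistics, the kurtosis reported is presumably excess kurtosis. A minimal sketch using simple population moments follows; this estimator is an assumption, and the statistical package used in the report may apply small-sample corrections.

```python
def skew_and_excess_kurtosis(scores):
    """Return (skew, excess kurtosis) from population central moments."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    if m2 == 0:
        raise ValueError("scores have zero variance")
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0.0 for a normal distribution
    return skew, excess_kurtosis


# A perfectly flat (uniform) set of raw scores from 0 to 20 is symmetric
# and flatter than normal, so its excess kurtosis is negative.
flat_skew, flat_kurtosis = skew_and_excess_kurtosis(list(range(21)))
```

A flat score distribution of this kind gives an excess kurtosis of roughly -1.2, which puts the values near -1.00 reported for four subtests in context.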

The intercorrelations for Forms O and P are presented in Table 9. Rather than presenting a
separate table for each of the three correlation matrices, the data are presented in one table
for easier comparison across forms. Please note that the tabled values are for correlations
between two subtests for one form; they are not correlations between forms (e.g., Form O with
Form P1). The highest correlations are between Arithmetic Reasoning and Math Knowledge and
between Reading Comprehension and Word Knowledge. The former pair of subtests is in the
Quantitative composite and the latter is in the Verbal composite. The lowest correlations are
found between Electrical Maze and Reading Comprehension, Electrical Maze and Word Knowledge,
Table Reading and Mechanical Comprehension, and Table Reading and Aviation Information. There is
a great amount of consistency in correlations across the three forms. Most of the differences
among the triads of correlations fall within expected ranges, given the reliabilities of the
subtests. The largest difference between any corresponding pair of correlations is .13. This
occurs for the correlations between Math Knowledge and Instrument Comprehension for Forms P1
and P2 and the correlations between Block Counting and Hidden Figures for Forms P1 and O.
Nonetheless, there is a high degree of similarity in correlation matrices across the three forms.

^This difference was chosen as a standard because one raw score point can have operational
implications. For example, in some portions of the Verbal composite conversion table, a
difference of one raw score unit results in a difference of three percentile units.
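The triad comparison for Table 9 can be illustrated with the Math Knowledge and Instrument Comprehension correlations as read from the table (.48 for Form O, .42 for Form P1, .55 for Form P2). The helper name is hypothetical; this is only a sketch of the comparison, not the authors' procedure.

```python
from itertools import combinations


def largest_triad_difference(r_by_form):
    """Largest absolute difference between any two forms' correlations."""
    return max(abs(a - b) for a, b in combinations(r_by_form.values(), 2))


# Math Knowledge with Instrument Comprehension, as read from Table 9.
mk_ic = {"O": 0.48, "P1": 0.42, "P2": 0.55}
mk_ic_gap = round(largest_triad_difference(mk_ic), 2)
```

For this triad the largest gap is between Forms P1 and P2, matching the .13 difference cited in the text.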


The format for the discussion of the composite analyses differs from the preceding format in
that the three forms of the AFOQT are not discussed individually. The comparison of Form O to
Forms P1 and P2 is followed by the comparison of Forms P1 and P2. For each composite,
the discussion will focus on the average proportion of items answered correctly, actual composite
mean scores, and composite mean scores adjusted for item length. The skew and kurtosis of the
composite distributions and the composite intercorrelations are also discussed.

As a result of the 12 items being removed from the scoring of Form O, Forms P1 and P2 and
Form O have a different number of items contributing to the composite scores. Table 10 compares
the number of items for Form O and Forms P1 and P2 by composite.

Table 10. Number of Items in AFOQT Forms
O and P Composites

                            Test form
Composite                 O       P1/P2
Navigator-Technical      257       265
Pilot                    200       205
Academic Aptitude        140       150
Verbal                    71        75
Quantitative              69        75

Table 11 presents descriptive data at the composite level on which the following discussion
is based. The proportion correct values for Form O composites tend to be lower than those for
both Forms P1 and P2. This generalization holds for the Navigator-Technical, Academic
Aptitude, and Quantitative composites. For these composites the differences between Form O
adjusted mean scores and Form P1 and P2 actual mean scores exceed one point. The
generalization also holds for the comparison of the Pilot and Verbal composites for Forms O and
P2. It does not hold, however, for two cases in the comparison of Forms O and P1; the
differences in proportion correct values for the Pilot and Verbal composites do not translate to
mean score differences in excess of one point. It should be noted here that the effects of the
differences in test difficulty are removed by the equating process.

The proportion correct values for Forms P1 and P2 are most similar for the Navigator-Technical
and Quantitative composites. However, an inspection of the mean scores indicates that
the only composite on which Forms P1 and P2 differ by less than one raw score point is the
Quantitative composite. The order of the other composites in terms of magnitude of mean score
differences (least to greatest) is Navigator-Technical, Verbal, Pilot, and Academic Aptitude. A
particularly noteworthy finding is that Form P2 has a higher mean score on all composites than
does Form P1, indicating that Form P2 is the easier of the new AFOQT forms. However, after
equating there should be no significant difference in scores between Forms P1 and P2.


Table 11. Composite Descriptive Statistics


The skew and kurtosis for the composite distributions are highly similar for the three
forms. For all three forms, all composites exhibited skew values between -.52 and -.20,
indicating that the composite score distributions are relatively symmetrical. As for kurtosis,
four composites exhibited values between -.59 and -.82, indicating that the distributions tend
toward normality. The Quantitative composite had values of approximately -1.00, indicating
slightly flatter distributions than those for the other composites.

Correlations among the five composites are shown in Table 12. The correlations are highly
similar for the three forms. That is, the intercorrelation matrix for Form O is similar to those
of Forms P1 and P2, with the latter two being nearly identical. The lowest correlations were
found between the Verbal composite and each of the Pilot, Navigator-Technical, and Quantitative
composites. It should be noted that the correlations between the Academic Aptitude composite and
both the Verbal and Quantitative composites are artificially inflated, because the Academic
Aptitude composite is a linear combination of the Verbal and Quantitative composites. The high
correlation between the Pilot and Navigator-Technical composites is also inflated, because the
composites have several subtests in common.
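The inflation of part-whole correlations can be made concrete with a short calculation. The sketch below assumes the Academic Aptitude composite is an unweighted sum of the Verbal and Quantitative composites and that the two parts have equal standard deviations; both are simplifying assumptions, not details stated in this section.

```python
from math import sqrt


def part_whole_correlation(r_parts: float, sd_part: float = 1.0,
                           sd_other: float = 1.0) -> float:
    """Correlation between one part and the sum of two parts correlated r_parts."""
    cov_part_sum = sd_part ** 2 + r_parts * sd_part * sd_other
    sd_sum = sqrt(sd_part ** 2 + sd_other ** 2 + 2 * r_parts * sd_part * sd_other)
    return cov_part_sum / (sd_part * sd_sum)


# Verbal and Quantitative correlate .73 on Form O (Table 12).
implied = part_whole_correlation(0.73)

# Even two uncorrelated parts would each correlate about .71 with their sum.
floor = part_whole_correlation(0.0)
```

Under these assumptions the implied value is about .93, close to the .93 and .94 reported for Academic Aptitude in Table 12, which suggests those correlations mostly reflect the shared content rather than an independent relationship.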


Table 12. Intercorrelations Among Composites

                            Navigator-   Academic
Composite            Form   Technical    Aptitude    Verbal   Quantitative
Pilot                O         .96          .80        .70        .80
                     P1        .95          .62        .71        .81
                     P2        .95          .82        .71        .81
Navigator-Technical  O                      .87        .71        .91
                     P1                     .89        .72        .93
                     P2                     .89        .73        .93
Academic Aptitude    O                                 .94        .93
                     P1                                .93        .94
                     P2                                .93        .94
Verbal               O                                            .73
                     P1                                           .74
                     P2                                           .74

Note. The correlations are inflated, since the composites have several
subtests in common.


Equating 

Form N currently serves as the form on which the normative sample was constructed for the
AFOQT. Forms P1 and P2 were equated to Form O in this study because Form O had been equated
to Form N in previous research (see Rogers, Roach, & Wegner, 1986). This places scores for Forms
P1 and P2 on the Form N metric.

As discussed earlier, decisions were made as to which of the four equating methods calculated
was most appropriate for each of the five composites of Forms P1 and P2. Again, selection
was based on the similarity of the distributions for the new test composite and the corresponding
composite in the reference test, Form O. The Standard Error of Estimate (SEE) was used as a
goodness-of-fit measure to determine which smoothing method would be chosen for each of the
equipercentile equatings. If one method resulted in a significantly smaller SEE than another,
the method with the smaller SEE was chosen. When two forms of smoothing did not differ greatly,
the form with the least complex regression equation was chosen. Table 13 contains the SEE values
for the three forms of smoothing of the equipercentile equatings.
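As a rough illustration of the goodness-of-fit measure, the SEE of a smoothing polynomial can be computed from the unsmoothed equated scores and the polynomial's fitted values. The degrees-of-freedom convention used here (number of points minus number of fitted coefficients) is an assumption; the report does not state its exact formula in this section, and the example data are invented.

```python
from math import sqrt


def standard_error_of_estimate(observed, fitted, n_coefficients):
    """SEE = sqrt(SSE / (n - k)), where k counts fitted coefficients (incl. intercept)."""
    n = len(observed)
    dof = n - n_coefficients
    if dof <= 0:
        raise ValueError("need more points than fitted coefficients")
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return sqrt(sse / dof)


# Hypothetical unsmoothed equated scores and the values of a fitted straight line.
unsmoothed = [10.0, 12.1, 13.9, 16.2, 18.0]
linear_fit = [10.0, 12.0, 14.0, 16.0, 18.0]
see_linear = standard_error_of_estimate(unsmoothed, linear_fit, n_coefficients=2)
```

A quadratic or cubic fit would use 3 or 4 coefficients respectively, so a higher-degree polynomial has to reduce the squared error enough to offset the lost degrees of freedom.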


Table 13. Standard Error of Estimate for Linear,
Quadratic, and Cubic Smoothing of Equipercentile Equating

                          P1 equipercentile             P2 equipercentile
AFOQT composite        Linear  Quadratic  Cubic      Linear  Quadratic  Cubic
Pilot                   2.79     1.41     1.24        2.41     1.15     1.10
Navigator-Technical     3.15     1.85     1.82        3.47     1.89     1.88
Academic Aptitude       2.24     1.76     1.50        2.72     1.78     1.65
Verbal                  1.42     1.40      .55        1.05      .64      .44
Quantitative            1.49      .78      .56        1.60      .79      .76

The equipercentile equating method was chosen for all equatings.^ Further, the selection
of polynomial smoothing method for each composite was governed by its joint performance on both
of the new test forms. Thus, the type of smoothing might vary among the composites but not on
any single composite for both Forms P1 and P2. For the Pilot, Navigator-Technical, and
Quantitative composites, a quadratic polynomial smoothing method was selected. For these
composites, the SEE values decreased significantly from linear to quadratic forms of smoothing,
but only trivially to cubic. For the Academic Aptitude and Verbal composites, sufficient
decreases in SEE were found from the quadratic to the cubic polynomial smoothing for both forms.
Although the decrease for Form P2 is less than the decrease for Form P1, the cubic smoothing
method was selected for both composites for the sake of logical consistency.^
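The joint selection logic can be sketched with the SEE values from Table 13. The decision rule below (keep the highest polynomial degree whose step lowers SEE by at least 0.10 on both forms) and the 0.10 cutoff are assumptions chosen for illustration; the report describes the decreases only qualitatively and gives no numeric criterion.

```python
# SEE values from Table 13, ordered (linear, quadratic, cubic).
SEE_TABLE = {
    "Pilot": {"P1": (2.79, 1.41, 1.24), "P2": (2.41, 1.15, 1.10)},
    "Navigator-Technical": {"P1": (3.15, 1.85, 1.82), "P2": (3.47, 1.89, 1.88)},
    "Academic Aptitude": {"P1": (2.24, 1.76, 1.50), "P2": (2.72, 1.78, 1.65)},
    "Verbal": {"P1": (1.42, 1.40, 0.55), "P2": (1.05, 0.64, 0.44)},
    "Quantitative": {"P1": (1.49, 0.78, 0.56), "P2": (1.60, 0.79, 0.76)},
}
DEGREE_NAMES = ("linear", "quadratic", "cubic")


def select_smoothing(see_by_form, min_drop=0.10):
    """Pick one smoothing degree per composite, judged jointly on both forms."""
    chosen = 0
    for degree in (1, 2):
        # Drop in SEE when moving up to this degree, computed for each form.
        drops = [see[degree - 1] - see[degree] for see in see_by_form.values()]
        if min(drops) >= min_drop:  # the step must help on both forms
            chosen = degree
    return DEGREE_NAMES[chosen]


selections = {name: select_smoothing(values) for name, values in SEE_TABLE.items()}
```

With this assumed cutoff the sketch gives quadratic smoothing for the Pilot, Navigator-Technical, and Quantitative composites and cubic smoothing for Academic Aptitude and Verbal, the same choices the text reports.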

Upon further inspection of the equatings, it was determined that separate conversion tables
were needed for Forms P1 and P2. The equivalent raw scores on Forms P1 and P2 equated to
different raw scores on Form O due to the differences between Forms P1 and P2. Although the
forms were developed to be parallel in content, format, and so on, slight differences in raw
score distributions were apparent. For example, in the Pilot composite in the range of the 10th
through 90th percentiles, the raw scores for the same percentile on the two forms differed by 2
to 4 points. Therefore, separate tables were required. The tables in Appendix A convert raw
scores for each composite of each version of Form P to the percentile score based on the Form N
metric.

^Equipercentile equatings were selected in lieu of linear (z-score) equatings. For all
composites on both Forms P1 and P2, the indexes of deviation (bias, average absolute
deviation, and root mean square deviation) between the accepted polynomial smoothing and linear
(z-score) equating were usually greater than .5 raw score points and often as large as 2 to 3
points. The magnitudes of these differences were judged too large to allow linear equatings of
the tests.

^All equatings will be reviewed and evaluated in the IOT&E and new tables provided as
dictated by the data.
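The three indexes of deviation named in the footnote (bias, average absolute deviation, and root mean square deviation) can be sketched as below. The version shown is unweighted across score points, which is an assumption; the report may have weighted the differences by score frequency. The example values are invented.

```python
from math import sqrt


def deviation_indexes(equated_a, equated_b):
    """Return (bias, average absolute deviation, root mean square deviation)."""
    diffs = [a - b for a, b in zip(equated_a, equated_b)]
    n = len(diffs)
    bias = sum(diffs) / n
    aad = sum(abs(d) for d in diffs) / n
    rmsd = sqrt(sum(d * d for d in diffs) / n)
    return bias, aad, rmsd


# Hypothetical equated scores from a smoothed equipercentile and a linear equating.
smoothed = [50.0, 60.5, 71.0, 80.0]
linear = [51.0, 60.0, 70.0, 82.0]
bias, aad, rmsd = deviation_indexes(smoothed, linear)
```

Bias can be small even when the two methods disagree substantially, because positive and negative differences cancel, which is why the absolute and squared indexes are reported alongside it.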


IV. CONCLUSIONS AND RECOMMENDATIONS

The goals of this research were to determine the adequacy of items in Forms P using a more
representative sample, to determine if Forms O, P1, and P2 are parallel, and to derive scores
on Forms P that are comparable to scores on Form O.

Enlisted personnel (BMTS subjects) and prospective officers (cadets in AFROTC and OTS) were
used to collect data for the purpose of examining item performance in Forms P. Based on the item
difficulty results, it can be concluded that the items in Forms P are acceptable, since the
majority of items fall within a desirable range of difficulty. Even though Forms P are slightly
easier than Form O, the item difficulty distributions are similar enough to proceed with test
equating in order to remove the effects of small differences in difficulty.

Based on item discrimination results, the items in Forms P are acceptable. On the whole,
items in Forms P have similar or slightly higher discrimination values than those of Form O. The
similarity across forms is even greater when comparing Form P1 with Form P2. It can be
concluded that the new forms have a slightly better ability to discriminate among examinees of
differing ability levels.

The majority of subtests have similar mean scores across the three forms. Form O has more
difficult subtests in three cases and easier subtests in two cases, but the three forms provide
almost identically shaped score distributions. The skew and kurtosis values indicated that the
majority of subtests are symmetrical and tend toward normality. Furthermore, there is great
consistency in these values across the three forms. The three forms of the AFOQT also show great
consistency in the intercorrelation matrices. Therefore, it can be concluded that at the subtest
raw score level, the three forms are generally parallel.

At the composite raw score level, the forms are also generally parallel. The proportion
correct values for the composites tend to be higher for Forms P than Form O. This holds for all
five composites of Form P2 and for three composites of Form P1. The proportion correct
values for the remaining two composites of Form P1 are similar. All composite score
distributions are roughly symmetric, with moderate peaks. These three forms of the AFOQT are
moderately parallel and, therefore, appropriate for equating.

Given that Forms P are generally parallel to Form O, it was possible to derive scores on
Forms P which are equivalent to scores on Form O. Equipercentile equatings with either quadratic
or cubic smoothings were used to generate provisional conversion tables (Appendix A). For three
of the composites (Pilot, Navigator-Technical, and Quantitative), quadratic smoothing was
selected; for the remaining two composites (Verbal and Academic Aptitude), cubic smoothing was
selected. The method of smoothing selected for each composite was the same for Forms P1 and
P2. For example, quadratic smoothing was selected for the Pilot composites of both Forms P1
and P2. These tables are recommended for use operationally until the completion of an Initial
Operational Test and Evaluation (IOT&E).
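For readers unfamiliar with the technique, a bare-bones, unsmoothed equipercentile equating can be sketched as follows. It maps each raw score on a new form to the reference-form score with the same percentile rank. The mid-point percentile rank convention, the linear interpolation, and all function names are assumptions for illustration; the polynomial smoothing step used in this research is not shown.

```python
from bisect import bisect_left


def percentile_ranks(scores, max_score):
    """Mid-point percentile rank of every raw score from 0 to max_score."""
    n = len(scores)
    counts = [0] * (max_score + 1)
    for s in scores:
        counts[s] += 1
    ranks, below = [], 0
    for c in counts:
        ranks.append(100.0 * (below + 0.5 * c) / n)
        below += c
    return ranks


def equipercentile_equate(new_scores, ref_scores, max_new, max_ref):
    """For each new-form raw score, the reference-form score at the same percentile rank."""
    pr_new = percentile_ranks(new_scores, max_new)
    pr_ref = percentile_ranks(ref_scores, max_ref)
    table = []
    for pr in pr_new:
        hi = bisect_left(pr_ref, pr)  # first reference score with rank >= pr
        if hi == 0:
            table.append(0.0)
        elif hi > max_ref:
            table.append(float(max_ref))
        else:
            lo = hi - 1
            frac = (pr - pr_ref[lo]) / (pr_ref[hi] - pr_ref[lo])
            table.append(lo + frac)  # interpolate between adjacent scores
    return table


# Hypothetical data: the reference form runs exactly one point higher.
new_form = [0, 1, 1, 2, 2, 2, 3, 3, 4]
ref_form = [s + 1 for s in new_form]
conversion = equipercentile_equate(new_form, ref_form, max_new=4, max_ref=5)
```

In this toy case every new-form score converts to a reference score one point higher, which is the behavior expected when one form is uniformly easier than the other.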

The goal of the planned IOT&E is to verify the conversion tables generated by this
pre-implementation research. The methodology will be similar to that described in this paper in that
Forms O and P will be distributed alternately within each testing session at each testing site.
The data editing and analysis will resemble those reported here but will produce conversion
tables based on operational data. The extent of the changes, if any, that will need to be made
to the final Forms P operational conversion tables cannot be determined at this time.
APPENDIX A: PROVISIONAL CONVERSION TABLES FOR AFOQT P1 AND P2


Table A-1. AFOQT - P1 Provisional Conversion Table
for Pilot Composite

Raw         Percentile    Raw         Percentile
00 - 41     01            117         48
42 - 47     02            118         50
48 - 54     03            119         51
55 - 58     04            120         52
59 - 61     05            121         53
62 - 64     06            122         54
65 - 67     07            123         55
68 - 70     08            124         57
71          09            125         58
72 - 73     10            126         60
74 - 75     11            127         61
76 - 77     12            128         62
78 - 79     13            129         63
80          14            130         64
81          15            131         65
82          16            132         66
83 - 85     17            133         67
86          18            134         69
87          19            135         70
88 - 89     20            136         71
90          21            137         73
91          22            138         74
92          23            139         76
93 - 94     24            140         77
95          25            141         78
96          26            142         79
97          27            143         80
98          28            144         81
99          29            145         82
100         30            146         83
101         31            147 - 148   84
102         32            149 - 150   86
103         33            151         87
104         34            152         88
105         35            153         89
106         36            154         90
107         37            155         91
108         38            156         92
109         39            157         93
110         41            158         94
111         42            159 - 160   95
112         43            161 - 163   96
113         44            164 - 167   97
114         45            168 - 172   98
115         46            173 - 205   99
116         47
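Applying a conversion table amounts to finding the raw-score band that contains an examinee's score. The sketch below encodes only the first five bands of the Pilot composite table above; the data structure and function name are hypothetical, chosen for illustration.

```python
# (lowest raw score, highest raw score, percentile) for the first five
# bands of Table A-1; the remaining bands are omitted for brevity.
PILOT_P1_BANDS = [
    (0, 41, 1),
    (42, 47, 2),
    (48, 54, 3),
    (55, 58, 4),
    (59, 61, 5),
]


def raw_to_percentile(raw, bands):
    """Look up the percentile for a raw composite score."""
    for low, high, percentile in bands:
        if low <= raw <= high:
            return percentile
    raise KeyError(f"raw score {raw} is outside the encoded bands")
```

For example, a raw Pilot composite score of 45 falls in the 42 to 47 band and converts to the 2nd percentile.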


Table A-2. AFOQT - P1 Provisional Conversion Table
for Navigator-Technical Composite

Raw         Percentile    Raw         Percentile
01 - 61     01            157         50
62 - 71     02            158         51
72 - 77     03            159         52
78 - 82     04            160         53
83 - 86     05            161         54
87 - 88     06            162         55
89 - 90     07            163         56
91 - 94     08            164         57
95 - 97     09            165         58
98 - 99     10            166         59
100 - 102   11            167         60
103 - 104   12            168         61
105 - 106   13            169         62
107 - 108   14            170         63
109 - 110   15            171         64
111 - 113   16            172 - 173   65
114 - 115   17            174         66
116 - 117   18            175         67
118         19            176         68
119 - 120   20            177         69
121 - 122   21            178         70
123 - 124   22            179         71
125         23            180         72
126         24            181 - 182   73
127 - 128   25            183         74
129         26            184         75
130         27            185         76
131         28            186         77
132         29            187         78
133 - 134   30            188 - 189   79
135         31            190         80
136         32            191 - 192   81
137         33            193         82
138         34            194 - 195   83
139         35            196         85
140 - 141   36            197         86
142         37            198 - 199   87
143 - 144   38            200 - 201   88
145         39            202 - 203   89
146         40            204 - 205   90
147         41            206 - 207   91
148         42            208         92
149         43            209 - 210   93
150         43            211 - 212   94
151         44            213 - 215   95
152         45            216 - 219   96
153         46            220 - 224   97
154         47            225 - 227   98
155         48            228 - 265   99
156         49


Table A-3. AFOQT - P1 Provisional Conversion Table
for Academic Aptitude Composite

Raw         Percentile    Raw         Percentile
01 - 27     01            95          50
28 - 34     02            96          51
35 - 40     03            97          52
41 - 42     04            98          53
43 - 46     05            99          54
47 - 48     06            100         57
49 - 51     07            101         59
52          08            102         61
53 - 55     09            103         62
56 - 58     10            104         63
59          11            105         65
60          12            106         67
61          13            107         68
62          14            108         69
63          15            109         70
64 - 65     16            110         71
66          17            111         72
67 - 68     18            112         75
69 - 70     19            113         76
71          20            114         78
72          21            115         79
73          22            116         80
74          23            117         81
75          24            118         82
76          25            119         83
77          26            120         84
78          27            121         85
79          28            122         86
80          29            123         87
81          31            124         88
82          33            125 - 126   89
83          34            127         90
84          35            128         91
85          36            129         92
86          37            130 - 131   93
87          38            132         94
88          40            133 - 134   95
89          41            135 - 136   96
90          43            137 - 139   97
91          44            140 - 141   98
92          45            142 - 150   99
93          47
94          49

Table A-4. AFOQT - P1 Provisional Conversion Table
for Verbal Composite

Raw         Percentile    Raw         Percentile
01 - 13     01            43          11
14          02            44          46
15 - 17     03            45          48
18          04            46          50
19          05            47          53
20          06            48          55
21          07            49          57
22          08            50          60
23          09            51          62
24          10            52          64
25          11            53          67
26          12            54          72
27          13            55          74
28          14            56          77
29          15            57          78
30          17            58 - 59     81
31          18            60          84
32          19            61          86
33          21            62          87
34          23            63          90
35          24            64          92
36          26            65 - 66     93
37          30            67          96
38          32            68          97
39          33            69 - 70     98
40          36            71 - 75     99
41          38
42          40


Table A-5. AFOQT - P1 Provisional Conversion Table
for Quantitative Composite

Raw         Percentile    Raw         Percentile
01 - 14     01            49          52
15 - 17     02            50          54
18 - 20     03            51 - 52     57
21          04            53          59
22 - 23     05            54          61
24          06            55          64
25          08            56          66
26 - 27     09            57          69
28          10            58          71
29          11            59          75
30 - 31     14            60          76
32          15            61          78
33          17            62          80
34          19            63          82
35 - 36     21            64          85
37          24            65          86
38          26            66          88
39          28            67          90
40          31            68          92
41 - 42     33            69          93
43          34            70          94
44          38            71          95
45          41            72          96
46          43            73          97
47          45            74          98
48          48            75          99


Table A-6. AFOQT - P2 Provisional Conversion Table
for Pilot Composite

Raw         Percentile    Raw         Percentile
00 - 44     01            121         50
45 - 49     02            122         51
50 - 56     03            123         52
57 - 61     04            124         53
62 - 63     05            125         54
64 - 67     06            126         55
68 - 70     07            127         56
71 - 72     08            128         57
73          09            129         58
74 - 75     10            130         60
76 - 78     11            131         62
79 - 80     12            132         63
81 - 82     13            133         64
83          14            134         65
84          15            135         66
85          16            136         67
86 - 87     17            137         69
88          18            138         70
89          19            139         71
90 - 91     20            140         73
92          21            141         74
93 - 94     22            142         75
95          23            143         76
96 - 97     24            144         77
98          25            145         78
99          26            146         79
100         27            147         81
101         28            148         82
102         29            149         83
103         30            150 - 151   84
104         31            152         85
105         32            153 - 154   86
106         33            155         87
107         34            156         88
108         35            157         89
109         36            158         91
110         37            159         92
111         38            160         93
112         39            161 - 162   94
113         41            163 - 164   95
114         42            165 - 166   96
115         43            167 - 171   97
116         44            172 - 176   98
117         45            177 - 205   99
118         46
119         47
120         48


Table A-7. AFOQT - P2 Provisional Conversion Table
for Navigator-Technical Composite

Raw         Percentile    Raw         Percentile
00 - 59     01            159         50
60 - 71     02            160         51
72 - 77     03            161         52
78 - 82     04            162         53
83 - 85     05            163         54
86 - 88     06            164         55
89 - 90     07            165         56
91 - 94     08            166         57
95 - 97     09            167         58
98 - 100    10            168         59
101 - 102   11            169         60
103 - 104   12            170         61
105 - 106   13            171         62
107 - 109   14            172         63
110 - 111   15            173         64
112 - 113   16            174 - 175   65
114 - 116   17            176         66
117 - 118   18            177         67
119         19            178         68
120 - 121   20            179         69
122 - 123   21            180         70
124 - 125   22            181         71
126         23            182         72
127         24            183 - 184   73
128 - 129   25            185         74
130         26            186         75
131         27            187         76
132         28            188         77
133         29            189         78
134 - 135   30            190 - 191   79
136 - 137   31            192         80
138         32            193 - 194   81
139         33            195         82
140         34            196         83
141         35            197         84
142         36            198         85
143         37            199         86
144 - 145   38            200 - 201   87
146         39            202 - 203   88
147         40            204 - 205   89
148         41            206 - 207   90
149         42            208 - 209   91
150 - 151   43            210         92
152         44            211 - 212   93
153 - 154   45            213 - 214   94
155         46            215 - 217   95
156         47            218 - 221   96
157         48            222 - 225   97
158         49            226 - 229   98
                          230 - 265   99


Table A-8. AFOQT - P2 Provisional Conversion Table
for Academic Aptitude Composite

Raw         Percentile    Raw         Percentile
00 - 27     01            100         51
28 - 35     02            101         52
36 - 41     03            102         53
42 - 44     04            103         54
45 - 48     05            104         57
49 - 50     06            105         59
51 - 53     07            106         61
54          08            107         62
55 - 58     09            108         63
59 - 60     10            109         65
61          11            110         67
62          12            111         68
63 - 64     13            112         69
65          14            113         70
66          15            114         71
67 - 68     16            115         72
69          17            116         75
70 - 72     18            117         76
73          19            118         78
74          20            119         79
75          21            120         80
76          22            121         81
77          23            122         82
78          24            123         83
79          25            124         84
80          26            125         85
81          27            126         86
82          28            127         87
83 - 84     29            128         88
85          31            129         89
86          33            130         90
87          34            131         91
88          35            132         92
89          36            133 - 134   93
90          37            135         94
91          38            136 - 138   95
92          40            139 - 140   96
93          41            141 - 142   97
94          43            143 - 144   98
95          44            145 - 150   99
96          45
97          47
98          49
99          50

Table A-9. AFOQT - P2 Provisional Conversion Table
for Verbal Composite

Raw         Percentile    Raw         Percentile
00 - 14     01            45          40
15          02            46          41
16 - 18     03            47          44
19          04            48          46
20          05            49          48
21          06            50          50
22          07            51          53
23 - 24     08            52          55
25          09            53          57
26          10            54          60
27          11            55          62
28          12            56          67
29          13            57          69
30          14            58          72
31          15            59          74
32          17            60          77
33          18            61          78
34          19            62          81
35          21            63          84
36          23            64          86
37          24            65          87
38          26            66          90
39          27            67          92
40          30            68          93
41          32            69          96
42          33            70          97
43          36            71          98
44          38            72 - 75     99

Table A-10. AFOQT - P2 Provisional Conversion Table
for Quantitative Composite

Raw         Percentile    Raw         Percentile
00 - 12     01            50          52
13 - 16     02            51          54
17 - 19     03            52          57
20          04            53          59
21 - 22     05            54          61
23          06            55          64
24 - 25     08            56          66
26          09            57 - 58     69
27 - 28     10            59          71
29          11            60          75
30          14            61          76
31 - 32     15            62          78
33          17            63          80
34          19            64          82
35 - 36     21            65          85
37          24            66          86
38          26            67          88
39          28            68          90
40 - 41     31            69          91
42          33            70          92
43          34            71          93
44          38            72          95
45          41            73          96
46          43            74          97
47 - 48     45            75          98
49          48

